Optimal Search for Minimum Error Rate Training

Authors

  • Michel Galley
  • Chris Quirk
Abstract

Minimum error rate training is a crucial component to many state-of-the-art NLP applications, such as machine translation and speech recognition. However, common evaluation functions such as BLEU or word error rate are generally highly non-convex and thus prone to search errors. In this paper, we present LP-MERT, an exact search algorithm for minimum error rate training that reaches the global optimum using a series of reductions to linear programming. Given a set of N-best lists produced from S input sentences, this algorithm finds a linear model that is globally optimal with respect to this set. We find that this algorithm is polynomial in N and in the size of the model, but exponential in S. We present extensions of this work that let us scale to reasonably large tuning sets (e.g., one thousand sentences), by either searching only promising regions of the parameter space, or by using a variant of LP-MERT that relies on a beam-search approximation. Experimental results show improvements over the standard Och algorithm.
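
The reduction the abstract describes is easy to make concrete: a choice of one hypothesis per sentence is reachable by some linear model exactly when a system of linear inequalities over the model weights is feasible. Below is a minimal Python sketch of that feasibility subproblem, assuming numpy and scipy are available; the function name, the feature-array layout, and the box constraint used to keep the LP bounded are illustrative assumptions, not the authors' implementation.

    # Hedged sketch, not the authors' code: the LP feasibility check behind
    # LP-MERT. Given one chosen hypothesis per sentence, ask whether some
    # weight vector w scores each chosen hypothesis at least as high as every
    # competitor in its N-best list, with a strictly positive margin t.
    import numpy as np
    from scipy.optimize import linprog

    def combination_is_optimal(nbest_feats, chosen, tol=1e-9):
        """nbest_feats: list of (N_s, d) feature arrays, one per sentence.
        chosen: index of the selected hypothesis in each N-best list.
        Returns (feasible, w); w attains a positive margin when feasible."""
        d = nbest_feats[0].shape[1]
        rows = []
        for feats, i in zip(nbest_feats, chosen):
            delta = np.delete(feats[i] - feats, i, axis=0)  # (N_s - 1, d)
            # Require w . delta >= t for each competitor: -delta.w + t <= 0.
            rows.append(np.hstack([-delta, np.ones((len(delta), 1))]))
        A_ub = np.vstack(rows)
        b_ub = np.zeros(len(A_ub))
        c = np.zeros(d + 1)
        c[-1] = -1.0                          # linprog minimizes, so maximize t
        bounds = [(-1.0, 1.0)] * (d + 1)      # box w (and t) to keep LP bounded
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
        feasible = bool(res.success) and res.x[-1] > tol
        return feasible, (res.x[:d] if feasible else None)

The full algorithm would then search over hypothesis combinations, keep the feasible ones, and return the weights of the combination with the lowest corpus-level error; enumerating combinations is what makes the exact search exponential in S.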


Similar Papers

Minimum Error Rate Training and the Convex Hull Semiring

We describe the line search used in the minimum error rate training algorithm (Och, 2003) as the “inside score” of a weighted proof forest under a semiring defined in terms of well-understood operations from computational geometry. This conception leads to a straightforward complexity analysis of the dynamic programming MERT algorithms of Macherey et al. (2008) and Kumar et al. (2009) and practi...
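
The computational-geometry operation alluded to here is simple to sketch: along a search direction, each hypothesis's model score is a line b + m*gamma in the step size gamma, and Och-style line search needs the upper envelope of those lines, i.e., which hypothesis wins on which interval. The following Python sketch computes that envelope; the (slope, intercept, id) encoding and the names are illustrative assumptions, not code from the paper.

    # Hedged sketch of the upper-envelope ("convex hull") computation used in
    # MERT line search. Input lines are (slope, intercept, hyp_id); output is
    # the winning hypothesis on each interval of gamma, left to right.
    def upper_envelope(lines):
        lines = sorted(lines, key=lambda l: (l[0], l[1]))
        filtered = []
        for ln in lines:                     # equal slopes: keep the highest
            if filtered and filtered[-1][0] == ln[0]:
                filtered[-1] = ln
            else:
                filtered.append(ln)
        hull, starts = [], []                # hull[k] wins from starts[k] on
        for m, b, h in filtered:
            while hull:
                m0, b0, _ = hull[-1]
                x = (b0 - b) / (m - m0)      # where the new line overtakes
                if x <= starts[-1]:
                    hull.pop(); starts.pop() # old top never wins; drop it
                else:
                    starts.append(x)
                    break
            if not hull:
                starts.append(float("-inf"))
            hull.append((m, b, h))
        return list(zip(starts, hull))

Sweeping these interval boundaries while tracking the error metric yields the 1-best error as a function of gamma, which is the quantity the semiring formulation computes over a whole forest rather than an N-best list.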


Supervised Training Using Global Search Methods

Supervised learning in neural networks based on the popular backpropagation method can often be trapped in a local minimum of the error function. The class of backpropagation-type training algorithms includes local minimization methods that have no mechanism that allows them to escape the influence of a local minimum. The existence of local minima is due to the fact that the error function is t...


Global Search Methods for Neural Network Training

In many cases the supervised neural network training using a backpropagation based learning rule can be trapped in a local minimum of the error function. These training algorithms are local minimization methods and have no mechanism that allows them to escape the influence of a local minimum. The existence of local minima is due to the fact that the error function is the superposition of nonline...


Rate-based versus distortion-based optimal error protection of embedded codes

We consider a joint source-channel coding system that protects an embedded bitstream using a finite family of channel codes with error detection and error correction capability. The performance of this system may be measured by the expected distortion or by the expected number of correctly decoded source bits. Whereas a rate-based optimal solution can be found in linear time, the computation of...


Optimal reference subset selection for nearest neighbor classification by tabu search

This paper presents an approach to select the optimal reference subset (ORS) for nearest neighbor classifier. The optimal reference subset, which has minimum sample size and satisfies a certain resubstitution error rate threshold, is obtained through a tabu search (TS) algorithm. When the error rate threshold is set to zero, the algorithm obtains a near minimal consistent subset of a given traini...
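
As a rough illustration of the search strategy named in this abstract (and not the paper's actual algorithm), tabu search over subsets can be sketched as single-item flips with a short-term memory that blocks recently flipped items; the scoring function, which for ORS selection would trade subset size against the resubstitution error threshold, is left abstract here.

    # Hedged generic tabu-search sketch over subsets encoded as frozensets.
    # score(subset) -> value to maximize; neighborhood = single-item flips.
    import random

    def tabu_search_subset(n_items, score, iters=200, tabu_len=10, seed=0):
        rng = random.Random(seed)
        current = frozenset(i for i in range(n_items) if rng.random() < 0.5)
        best, best_val = current, score(current)
        tabu = []                            # recently flipped items
        for _ in range(iters):
            moves = [(score(current ^ {i}), i)
                     for i in range(n_items) if i not in tabu]
            if not moves:
                break
            val, i = max(moves)
            current ^= {i}                   # accept even a worsening move
            tabu.append(i)
            if len(tabu) > tabu_len:
                tabu.pop(0)                  # tabu tenure expires
            if val > best_val:
                best, best_val = current, val
        return best, best_val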



Publication date: 2011